AMENDMENTS TO THE CLAIMS

Please enter the following amendments: 

1. (Currently Amended) An apparatus for determining, based on speech waveform data,
a portion representing a feature of the speech waveform, comprising: 

an acoustic/prosodic analysis unit which calculates, from said data, a distribution of 
energy of a prescribed frequency range of said speech waveform along a time axis, and extracts, 
among various syllables, a first portion of said speech waveform that is generated stably by a 
source of said speech waveform, based on the distribution of energy and pitch of said speech 
waveform; 

a cepstral analysis unit which calculates, from said data, a frequency spectrum 
distribution of said speech waveform along the time axis, and estimates, based on the frequency 
spectrum distribution, a second portion of said speech waveform, for which change is well 
controlled by said source; and 

a pseudo-syllabic center extracting unit which determines the portion representing the
feature of said speech waveform based on the first portion extracted by the acoustic/prosodic
analysis unit and the second portion estimated by the cepstral analysis unit, wherein

said cepstral analysis unit includes: 

a linear prediction analysis unit which performs linear prediction analysis on said 
speech waveform and outputs an estimated value of formant frequency;

a cepstral distance calculating unit which calculates, using said data, a distribution
of cepstral distance on the time axis based on the estimated value of formant frequency provided 
by said linear prediction analysis unit; 

an inter-frame variance calculating unit which calculates, based on an output from 
said linear prediction analysis unit, distribution of local variance of magnitude of delta cepstrum 
of said speech waveform on the time axis; and 

a reliability center candidate output unit which estimates, based both on said 
distribution of cepstral distance on the time axis based on the estimated value of formant 
frequency calculated by said cepstral distance calculating unit and on said distribution on the 
time axis of local variance of magnitude of delta cepstrum of said speech waveform calculated 
by said inter-frame variance calculating unit, a range in which change in the speech waveform is 
well controlled by said source. 

2. (Previously Presented) The apparatus according to claim 1, wherein 
said acoustic/prosodic analysis unit includes: 

a pitch determining unit which determines, based on said data, whether each 
segment of said speech waveform is a voiced segment or not, 

a dip detecting unit which separates said speech waveform into syllables at a local 
minimum of said waveform of energy distribution of the prescribed frequency range of said 
speech waveform on the time axis; and 

a voiced/energy determining unit which extracts that range of said speech 
waveform which includes, in each syllable, an energy peak in that syllable within the segment 
determined to be a voiced segment by said pitch determining unit and in which the energy of the
prescribed frequency range is not lower than a prescribed threshold value. 



3. (Canceled) 



4. (Currently Amended) The apparatus according to claim 1, wherein 
said pseudo-syllabic center extracting unit determines a range, included in the first portion
of said speech waveform extracted by said acoustic/prosodic analysis unit, within the range of
which change in said speech waveform is estimated by said cepstral analysis unit to be well
controlled by said source.



5-7. (Canceled) 



8. (Currently Amended) A machine readable storage medium readable by a computer,
the medium having data stored thereon, the data, when executed by a processor of the
computer, causes the processor to operate as an apparatus for determining, based on speech
waveform data, a portion representing a feature of the speech waveform, said apparatus
comprising:

an acoustic/prosodic analysis unit which calculates, from said data, distribution of energy 
of a prescribed frequency range of said speech waveform along a time axis, and extracts,
among various syllables, a first portion of said speech waveform that is generated stably by a 
source of said speech waveform, based on the distribution of energy and pitch of said speech 
waveform;

a cepstral analysis unit which calculates, from said data, a frequency spectrum
distribution of said speech waveform along the time axis, and estimates, based on the frequency
spectrum distribution, a second portion of said speech waveform, for which change is well 
controlled by said source; and 

a pseudo-syllabic center extracting unit which determines the portion representing a
feature of said speech waveform based on the first portion extracted by the acoustic/prosodic
analysis unit and the second portion, wherein
said cepstral analysis unit includes: 

a linear prediction analysis unit which performs linear prediction analysis on said 
speech waveform and outputs an estimated value of formant frequency;

a cepstral distance calculating unit which calculates, using said data, a distribution 
of cepstral distance on the time axis based on the estimated value of formant frequency provided 
by said linear prediction analysis unit; 

an inter-frame variance calculating unit which calculates, based on an output from 
said linear prediction analysis unit, distribution of local variance of magnitude of delta cepstrum 
of said speech waveform on the time axis; and 

a reliability center candidate output unit which estimates, based both on said 
distribution of cepstral distance on the time axis based on the estimated value of formant 
frequency calculated by said cepstral distance calculating unit and on said distribution on the 
time axis of local variance of magnitude of delta cepstrum of said speech waveform calculated 
by said inter-frame variance calculating unit, a range in which change in the speech waveform is 
well controlled by the source.

9. (Currently Amended) The machine readable medium according to claim 8, wherein
said acoustic/prosodic analysis unit includes: 

a pitch determining unit which determines, based on said data, whether each 
segment of said speech waveform is a voiced segment or not, 

a dip detecting unit which separates said speech waveform into syllables at a local 
minimum of said waveform of energy distribution of the prescribed frequency range of said 
speech waveform on the time axis; and 

a voiced/energy determining unit which extracts that range of said speech 
waveform which includes, in each syllable, an energy peak in that syllable within the segment 
determined to be a voiced segment by said pitch determining unit and in which the energy of the 
prescribed frequency range is not lower than a prescribed threshold value. 

10. (Canceled) 

11. (Currently Amended) The machine readable medium according to claim 8, wherein
said pseudo-syllabic center extracting unit determines a range, included in the first portion
of said speech waveform extracted by said acoustic/prosodic analysis unit, within the range of
which change in speech waveform is estimated by said cepstral analysis unit to be well
controlled by said source.

12-13. (Canceled)

14. (Currently Amended) A method of extracting from speech waveform data a
portion representing a feature of the speech waveform, comprising the steps of: 

calculating, from said data, a distribution of energy of a prescribed frequency range of 
said speech waveform along a time axis, and extracting, among various syllables, a first portion 
of said speech waveform, that is generated stably by a source of said speech waveform, based on 
the distribution of energy and pitch of said speech waveform; 

calculating, from said data, a frequency spectrum distribution of said speech waveform 
along the time axis, and estimating, based on the frequency spectrum distribution, a second 
portion of said speech waveform, for which change is well controlled by said source; and 

extracting the portion representing a feature of said speech waveform based on the first 
portion extracted in said extracting step and the second portion, wherein

said estimating step includes: 

performing linear prediction analysis on said speech waveform and outputting an 
estimated value of formant frequency; 

calculating, using said data, a distribution of cepstral distance on the time axis 
based on the estimated value of formant frequency provided in said step of outputting the 
estimated value; 

calculating, based on the calculated distribution based on the estimated value of 
formant frequency, distribution of local variance of magnitude of delta cepstrum of said speech 
waveform on the time axis; and 

estimating, based both on said calculated distribution of cepstral distance on the 
time axis related to the estimated value of formant frequency and on said calculated distribution 
on the time axis of local variance of magnitude of delta cepstrum of said speech waveform, a
range in which change in the speech waveform is well controlled by said source. 



15. (Previously Presented) The method according to claim 14, wherein 

said step of extracting a first portion of said speech waveform includes the steps of: 

determining, based on said data, whether each segment of said speech waveform
is a voiced segment or not,

detecting a local minimum of said waveform of energy distribution of the
prescribed frequency range of said speech waveform on the time axis, and separating said speech
waveform into syllables at the local minimum; and

extracting that range of said speech waveform which includes, in each syllable, an
energy peak in that syllable within a segment determined to be a voiced segment and in which
the energy of the prescribed frequency range is not lower than a prescribed threshold value.
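The three steps recited in claim 15 can be illustrated with a minimal sketch. It is not the applicant's implementation: the function names, the per-frame `voiced` flags, and the choice to grow the range outward from the energy peak are all assumptions made for illustration, and the energy contour is assumed to be already computed for the prescribed frequency range.

```python
from typing import List, Tuple

def local_minima(energy: List[float]) -> List[int]:
    """Indices of interior frames that form a dip in the energy contour."""
    return [i for i in range(1, len(energy) - 1)
            if energy[i] < energy[i - 1] and energy[i] <= energy[i + 1]]

def split_into_syllables(energy: List[float]) -> List[Tuple[int, int]]:
    """Cut the frame sequence at each dip; ranges are half-open (start, end)."""
    cuts = [0] + local_minima(energy) + [len(energy)]
    return [(a, b) for a, b in zip(cuts, cuts[1:]) if b > a]

def stable_ranges(energy: List[float], voiced: List[bool],
                  threshold: float) -> List[Tuple[int, int]]:
    """Per syllable, return the range around the voiced energy peak in which
    every frame is voiced and has energy not lower than the threshold."""
    ranges = []
    for start, end in split_into_syllables(energy):
        voiced_frames = [i for i in range(start, end) if voiced[i]]
        if not voiced_frames:
            continue  # no voiced segment in this syllable
        peak = max(voiced_frames, key=lambda i: energy[i])
        if energy[peak] < threshold:
            continue  # peak itself is below the prescribed threshold

        def ok(i: int) -> bool:
            return voiced[i] and energy[i] >= threshold

        lo = peak
        while lo - 1 >= start and ok(lo - 1):
            lo -= 1
        hi = peak
        while hi + 1 < end and ok(hi + 1):
            hi += 1
        ranges.append((lo, hi + 1))  # half-open frame range
    return ranges
```

With an energy contour of `[1, 5, 9, 4, 1, 3, 8, 7, 2]`, a single dip at frame 4 separates two syllables, and each yields one range around its own peak.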

16. (Canceled) 

17. (Currently Amended) The method according to claim 14, wherein

said step of extracting the portion representing a feature of said speech waveform
includes the step of:

determining a range, included in the first portion of said speech waveform extracted in
said extracting step, within the range of which change in said speech waveform is estimated in
said estimating step to be well controlled by said source.

18-22. (Canceled)

23. (Previously Presented) An apparatus as recited in claim 1, wherein 

said cepstral analysis unit is configured to calculate, from said data, a frequency spectrum 
distribution of said speech waveform along the time axis, and estimate the second portion, based 
on the frequency spectrum distribution, as a portion where local variance of changes of the 
frequency spectrum is at a local minimum. 

24. (Previously Presented) An apparatus as recited in claim 1, wherein 
said cepstral distance calculating unit includes: 

a cepstrum re-generating unit connected to receive said estimated value of 
formant frequency from said linear prediction analysis unit, for recalculating cepstrum 
coefficients based on said value of formant frequency; and 

a logarithmic transformation and inverse discrete cosine transformation unit 
connected to receive said speech waveform data for calculating FFT cepstrum coefficients based 
on said waveform data, wherein 

the cepstral distance calculating unit is configured to calculate cepstrum distance 
between the cepstrum coefficients recalculated by said cepstrum re-generating unit and the FFT 
cepstrum coefficients calculated by said logarithmic transformation and inverse discrete cosine
transformation unit, said cepstrum distance indicating a distribution of unreliability; and 
said cepstral analysis unit includes:

a standardizing and integrating unit which combines the cepstrum distance and
the distribution on the time axis of local variance of spectral change and outputs combined
data, wherein

the reliability center candidate output unit estimates the range in which change in 
the speech waveform is well controlled by said source at a dip of the combined data output by 
said standardizing and integrating unit. 

25. (Currently Amended) The machine readable medium according to claim 8, wherein 
said cepstral distance calculating unit includes: 

a cepstrum re-generating unit connected to receive said estimated value of 
formant frequency from said linear prediction analysis unit, for recalculating cepstrum 
coefficients based on said value of formant frequency; and 

a logarithmic transformation and inverse discrete cosine transformation unit 
connected to receive said speech waveform data for calculating FFT cepstrum coefficients based 
on said waveform data, wherein 

the cepstral distance calculating unit is configured to calculate cepstrum distance 
between the cepstrum coefficients recalculated by said cepstrum re-generating unit and the FFT 
cepstrum coefficients calculated by said logarithmic transformation and inverse discrete cosine
transformation unit, said cepstrum distance indicating a distribution of unreliability; and 
said cepstral analysis unit includes: 

a standardizing and integrating unit which combines the cepstrum distance and
the distribution on the time axis of local variance of spectral change and outputs combined
data, wherein

the reliability center candidate output unit estimates the range in which change in
the speech waveform is well controlled by said source at a dip of the combined data output by 
said standardizing and integrating unit. 

26. (Previously Presented) The method according to claim 14, wherein 
said step of calculating a distribution of energy includes: 

receiving said estimated value of formant frequency, and recalculating cepstrum 
coefficients based on said value of formant frequency; 

receiving said speech waveform data for calculating FFT cepstrum coefficients 
based on said waveform data; and 

calculating cepstrum distance between the recalculated cepstrum coefficients and 
the FFT cepstrum coefficients, said cepstrum distance indicating a distribution of unreliability; 
and wherein 

said estimating step further includes: 

combining the cepstrum distance and the distribution on the time axis of local 
variance of spectral change and outputting a combined data; and 

estimating the range in which change in the speech waveform is well controlled 
by said source at a dip of the combined data. 
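The combining and dip-finding steps recited in claim 26 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: Euclidean distance between per-frame cepstrum vectors, a three-frame window for local variance, z-score standardization, and simple summation are all choices made here, and every function name is hypothetical.

```python
import math
from statistics import mean, pstdev, pvariance
from typing import List, Sequence

def cepstral_distance(regenerated: Sequence[Sequence[float]],
                      fft_based: Sequence[Sequence[float]]) -> List[float]:
    """Per-frame distance between recalculated and FFT cepstrum coefficients."""
    return [math.dist(a, b) for a, b in zip(regenerated, fft_based)]

def delta_magnitude(cepstra: Sequence[Sequence[float]]) -> List[float]:
    """Magnitude of frame-to-frame cepstral change (0.0 for the first frame)."""
    return [0.0] + [math.dist(cepstra[t], cepstra[t - 1])
                    for t in range(1, len(cepstra))]

def local_variance(values: List[float], half_width: int = 1) -> List[float]:
    """Variance of the values inside a sliding window centred on each frame."""
    return [pvariance(values[max(0, t - half_width): t + half_width + 1])
            for t in range(len(values))]

def standardize(values: List[float]) -> List[float]:
    """Z-score normalization; a constant sequence maps to all zeros."""
    mu, sigma = mean(values), pstdev(values)
    return [0.0 if sigma == 0 else (v - mu) / sigma for v in values]

def combine(distance: List[float], variance: List[float]) -> List[float]:
    """Sum of the two standardized contours (assumed integration rule)."""
    return [d + v for d, v in zip(standardize(distance), standardize(variance))]

def dips(values: List[float]) -> List[int]:
    """Interior frames at which the combined contour has a local minimum."""
    return [i for i in range(1, len(values) - 1)
            if values[i] < values[i - 1] and values[i] <= values[i + 1]]
```

Frames returned by `dips(combine(distance, variance))` are candidate centers of ranges in which both measures of unreliability are locally low.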